Word clustering based on co-occurrence information in English GIS textbooks
نویسندگان
چکیده
منابع مشابه
Word Clustering and Disambiguat ion Based on Co-occurrence Data
We address the problem of clustering words (or constructing a thesaurus) based on co-occurrence data, and using the acquired word classes to improve the accuracy of syntactic disambiguation. We view this problem as that of estimating a joint probability distribution specifying the joint probabilities of word pairs, such as noun verb pairs. We propose an efficient algorithm based on the Minimum ...
متن کاملWord Clustering and Disambiguation Based on Co-occurrence Data1
We address the problem of clustering words (or constructing a thesaurus) based on cooccurrence data, and conducting syntactic disambiguation by using the acquired word classes. We view the clustering problem as that of estimating a class-based probability distribution specifying the joint probabilities of word pairs. We propose an efficient algorithm based on the Minimum Description Length (MDL...
متن کاملClustering Co-occurrence Graph based on Transitivity
Word co-occurrences form a graph, regarding words as nodes and co-occurrence relations as branches. Thus, a co-occurrence graph can be constructed by co-occurrence relations in a corpus. This paper discusses a clustering method of the co-occurrence graph, the decomposition of the graph, from a graph-theoretical viewpoint. Since one of the applications for the clustering results is the ambiguity...
متن کاملExtracting Word Correspondences from Bilingual Corpora Based on Word Co-occurrence Information
A new method has been developed for extracting word correspondences from a bilingual corpus. First, the co-occurrence infi~rmation for each word in both languages is extracted li'om the corpus. Then, the correlations between the co-occurrence features of the words are calculated pairwisely with tile assistance of a basic word bilingual dictionary. Finally, the pairs of words with the highest co...
متن کاملImproving semantic topic clustering for search queries with word co-occurrence and bigraph co-clustering
Uncovering common themes from a large number of unorganized search queries is a primary step to mine insights about aggregated user interests. Common topic modeling techniques for document modeling often face sparsity problems with search query data as these are much shorter than documents. We present two novel techniques that can discover semantically meaningful topics in search queries: i) wo...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Theory and Applications of GIS
سال: 2007
ISSN: 1340-5381
DOI: 10.5638/thagis.15.129